Bring OCR to Mealie for importing scanned recipes #1244

Miroito · 2022-05-19T17:22:53Z

Added so far

New tab in the recipe creation page for scanned recipes (I'm open to suggestions for the icon that should be used)
~~As a first draft, the recipe is created with the recognized text in the description field to copy and paste later in edit mode.~~ no more
2022-07-28 Update: I'm at a point where I think the component is very usable and open for reviews to merge something that is very close to what is already implemented.

Before merging

Here is the list of tasks to complete before we consider this code acceptable to merge into the beta

Nice-to-haves

Define the canvas variable pointing to the canvas HTML element only once instead of in every function of the ocr-editor component
Add advanced settings (e.g. language, to improve pytesseract recognition)
Possibility to add multiple pages/files and switch between them
Clean up the ocr-editor page by creating a RecipeOcrEditor component
Automatic field filling suggestion
- Recipe title

Design

This new feature is based on previous experience with a similar software solution called Esker.
The process that I have designed for now gives the user a new creation page, /recipe/create/ocr, where they can upload a picture and optionally make it the recipe thumbnail. This creates a recipe called "New OCR Recipe" with the uploaded picture as an asset called "Original recipe image". Additionally, a new column in the recipes table registers that this recipe is an OCR recipe.

The user is directed to the page "recipe/_slug/ocr-editor" where they can use the image they uploaded to fill the usual recipe fields on the right part of the page. When this page is mounted, it sends the asset name to the backend, which sends back the text contained in the image and its position.

Two modes are available.

Selection mode lets the user input data.
Pan and Zoom mode lets the user move around when pictures are big enough to do so.

In selection mode, the user can draw a rectangle, and the identified text will appear under the canvas. The user can then select any recipe field on the right, then click anywhere inside the rectangle. This takes whatever text is fully contained in the rectangle and overwrites the field that was last selected.
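
To illustrate the "fully contained" rule, here is a minimal sketch. The `Rect` and `OcrWord` shapes and the function names are assumptions made for this example (the thread only says the backend returns the text and its position); this is not the code in the ocr-editor component.

```typescript
// Hypothetical shapes: a bounding box in image pixels, and a recognized word with its box.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface OcrWord extends Rect {
  text: string;
}

// A rectangle dragged up or to the left has negative width/height; make it positive.
function normalizeRect(r: Rect): Rect {
  return {
    x: Math.min(r.x, r.x + r.width),
    y: Math.min(r.y, r.y + r.height),
    width: Math.abs(r.width),
    height: Math.abs(r.height),
  };
}

// True only when `inner` lies entirely within `outer` (touching edges count as inside).
function isFullyContained(inner: Rect, outer: Rect): boolean {
  return (
    inner.x >= outer.x &&
    inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height
  );
}

// Keep the words whose whole bounding box is inside the drawn selection.
function selectWords(words: OcrWord[], selection: Rect): OcrWord[] {
  const rect = normalizeRect(selection);
  return words.filter((w) => isFullyContained(w, rect));
}

const sampleWords: OcrWord[] = [
  { text: "2", x: 10, y: 10, width: 10, height: 10 },
  { text: "eggs", x: 25, y: 10, width: 30, height: 10 },
  { text: "Serves", x: 90, y: 10, width: 40, height: 10 }, // sticks out on the right
];

// Rectangle drawn from bottom-right to top-left, hence the negative size.
const picked = selectWords(sampleWords, { x: 100, y: 30, width: -95, height: -25 });
const pickedText = picked.map((w) => w.text).join(" "); // "2 eggs"
```

A word that is only partially covered ("Serves" above) is dropped, which matches the behaviour described: only text fully inside the rectangle is used.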

The bulk add buttons will spawn a dialog pre-filled with the selected text (that is, the text under the drawn rectangle).
This is where the Split text modes come into play: they let the user choose how line breaks are handled. The first mode keeps all line breaks. For example, if a recipe book lists one ingredient per line, the user can select the whole list, press bulk add on the ingredient tab and add all ingredients in 2 clicks.

The flatten mode removes all line breaks, and the blocks mode puts line breaks between the blocks identified by tesseract. The blocks mode is pretty useful for instructions, which usually come as multiple paragraphs in the form of blocks, making it easier to use the bulk add dialog, this time for instructions.
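
The three Split text modes can be sketched as a single join function. This is an illustration under assumptions: the mode names, the `OcrToken` shape and `joinTokens` are hypothetical, and the block and line numbers are assumed to come from tesseract's per-word output in reading order.

```typescript
// Hypothetical mode names for: keep all line breaks, remove them all, break between blocks only.
type SplitMode = "lineBreak" | "flatten" | "blocks";

// Hypothetical token shape: a word plus the block and line it was recognized in.
interface OcrToken {
  text: string;
  blockNum: number;
  lineNum: number;
}

// Join words in reading order, choosing "\n" or " " between neighbours based on the mode.
function joinTokens(tokens: OcrToken[], mode: SplitMode): string {
  let out = "";
  let prev: OcrToken | undefined;
  for (const cur of tokens) {
    if (prev !== undefined) {
      const newBlock = cur.blockNum !== prev.blockNum;
      const newLine = newBlock || cur.lineNum !== prev.lineNum;
      const breakHere =
        mode === "lineBreak" ? newLine : mode === "blocks" ? newBlock : false;
      out += breakHere ? "\n" : " ";
    }
    out += cur.text;
    prev = cur;
  }
  return out;
}

const sampleTokens: OcrToken[] = [
  { text: "200g", blockNum: 1, lineNum: 1 },
  { text: "flour", blockNum: 1, lineNum: 1 },
  { text: "2", blockNum: 1, lineNum: 2 },
  { text: "eggs", blockNum: 1, lineNum: 2 },
  { text: "Mix", blockNum: 2, lineNum: 1 },
  { text: "well", blockNum: 2, lineNum: 1 },
];

const keepLines = joinTokens(sampleTokens, "lineBreak"); // "200g flour\n2 eggs\nMix well"
const flattened = joinTokens(sampleTokens, "flatten"); // "200g flour 2 eggs Mix well"
const byBlocks = joinTokens(sampleTokens, "blocks"); // "200g flour 2 eggs\nMix well"
```

With one ingredient per line, the first mode yields one entry per line for the bulk add dialog, while the blocks mode yields one entry per paragraph, which suits instructions.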

For recipes that are called New OCR Recipe (n), i.e. matching the regex /New\sOCR\sRecipe(\s\([0-9]+\))?/g, the ocr-editor component will take the biggest block with the fewest words, assume it is the recipe's title, and populate the recipe name field with it. This is done by the function findRecipeTitle in the ocr-editor component.
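
As a rough sketch of this heuristic (not the actual findRecipeTitle implementation), the code below makes several assumptions: "biggest" is read as tallest average word height, ties go to the block with fewer words, words arrive in reading order so that words of one block are consecutive, and the `TitleWord` shape and function names are hypothetical. The pattern is an anchored variant of the regex quoted above without the `g` flag, because `RegExp.prototype.test` on a global regex keeps state in `lastIndex` between calls.

```typescript
// Anchored, non-global variant of the quoted pattern (a choice made for this sketch).
const DEFAULT_OCR_NAME = /^New\sOCR\sRecipe(\s\([0-9]+\))?$/;

function hasDefaultOcrName(name: string): boolean {
  return DEFAULT_OCR_NAME.test(name);
}

// Hypothetical word shape: text, pixel height, and the block it belongs to.
interface TitleWord {
  text: string;
  height: number;
  blockNum: number;
}

// Group consecutive words that share a block number.
function groupByBlock(words: TitleWord[]): TitleWord[][] {
  const groups: TitleWord[][] = [];
  let current: TitleWord[] = [];
  let currentBlock: number | undefined;
  for (const w of words) {
    if (w.blockNum !== currentBlock) {
      current = [];
      groups.push(current);
      currentBlock = w.blockNum;
    }
    current.push(w);
  }
  return groups;
}

// Pick the block with the tallest average word height; on a tie, prefer fewer words.
function guessTitle(words: TitleWord[]): string | undefined {
  let best: TitleWord[] | undefined;
  let bestHeight = 0;
  for (const block of groupByBlock(words)) {
    const avgHeight =
      block.reduce((sum, w) => sum + w.height, 0) / block.length;
    const taller = avgHeight > bestHeight;
    const tieWithFewerWords =
      best !== undefined && avgHeight === bestHeight && block.length < best.length;
    if (best === undefined || taller || tieWithFewerWords) {
      best = block;
      bestHeight = avgHeight;
    }
  }
  return best === undefined ? undefined : best.map((w) => w.text).join(" ");
}

const pageWords: TitleWord[] = [
  { text: "Apple", height: 30, blockNum: 1 },
  { text: "Pie", height: 30, blockNum: 1 },
  { text: "Mix", height: 12, blockNum: 2 },
  { text: "the", height: 12, blockNum: 2 },
  { text: "flour", height: 12, blockNum: 2 },
];

// Only overwrite the name while it is still the auto-generated one.
const currentName = "New OCR Recipe (2)";
const suggestedName = hasDefaultOcrName(currentName)
  ? guessTitle(pageWords) ?? currentName
  : currentName; // "Apple Pie"
```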

When the user is happy with the edits, the recipe can be saved the usual way. They can come back to the OCR editor page by clicking the usual edit button and using the new button "OCR Editor" that will appear when the recipe is an OCR Recipe (hence the new table column).

hay-kot · 2022-05-22T19:53:26Z

FYI Rebasing should get CI sorted.

Need the changes from #1252

.pre-commit-config.yaml

hay-kot · 2022-07-28T22:31:21Z

Problem with your CI was your poetry lock file

Now it's failing for other reasons 😄

Miroito · 2022-07-29T13:36:24Z

Thanks, I missed this line in the feedback.

Yes, it sounds like the type check is pretty angry at me for being sloppy. I'll get it to work.

hay-kot

These are just my cursory review comments. I haven't done a thorough review and there will likely be more changes required to get this merged in.

Before I dig too deep into this I would like to see a more thorough write-up of the feature overall, how it works.

Some critical areas I see that need more documentation are frontend/pages/recipe/_slug/ocr-editor.vue and some context around the tests that you've written, like what is being tested and how we can validate the skipped test if the output is somewhat unreliable.

Looks really cool so far, thanks for your work on this one!

.pre-commit-config.yaml

frontend/api/class-interfaces/ocr.ts

mealie/schema/recipe/recipe.py

Miroito · 2022-08-09T07:12:17Z

Added a design section to explain all the relevant work that makes this possible. Hopefully the block of text is not too hard to read.
I want to add a limitations section later on to explain my thoughts on the current implementation and what can be done to make it more stable, reliable and functionally richer.

Miroito · 2022-08-16T16:07:51Z

Rebased to fix merge conflicts with mealie-next branch

Miroito · 2022-08-19T17:46:37Z

I have tried using the feature to add a bunch of recipes to see how it does. And... It's outputting gibberish when jpg files include a lot of text, which is the worst user experience imaginable. Either there is a lot of pre-processing to do, which would make it much slower than it is currently,
or, at least temporarily, I could either restrict the files to png or convert all images to png when they are uploaded.

A little bit disappointed with tesseract. I'm going to investigate further as to why it is behaving this way.

Putting the PR temporarily in draft again.

Miroito · 2022-09-03T16:19:01Z

I can't keep rebasing this amount of changes every time, so I have marked the PR as ready for review. Feedback would be greatly appreciated so we can merge this as soon as possible.

hay-kot · 2022-09-03T18:42:47Z

> I have tried using the feature to add a bunch of recipes to see how it does. And... It's outputting gibberish when jpg files include a lot of text, which is the worst user experience imaginable. Either there is a lot of pre-processing to do, which would make it much slower than it is currently,
> or, at least temporarily, I could either restrict the files to png or convert all images to png when they are uploaded.
>
> A little bit disappointed with tesseract. I'm going to investigate further as to why it is behaving this way.
>
> Putting the PR temporarily in draft again.

Any updates on this comment?

Miroito · 2022-09-03T19:33:18Z

> > I have tried using the feature to add a bunch of recipes to see how it does. And... It's outputting gibberish when jpg files include a lot of text, which is the worst user experience imaginable. Either there is a lot of pre-processing to do, which would make it much slower than it is currently,
> > or, at least temporarily, I could either restrict the files to png or convert all images to png when they are uploaded.
> > A little bit disappointed with tesseract. I'm going to investigate further as to why it is behaving this way.
> > Putting the PR temporarily in draft again.
>
> Any updates on this comment?

For now, I have restricted the image format to png. I have found no help on Tesseract's side, though I did not look too far.

There is also the option of converting any input image to png, then using the png to do the OCR. This adds a huge overhead, which means I could no longer afford to ask the server to recognize the characters in the image every time the OCR editor is loaded, as it does now.
I think in the long term, it is worth taking a little more time when creating the recipe and storing all the OCR info in the database. Creating the recipe the first time will take longer, but overall the recognition can probably be improved by preprocessing, and the long processing would happen only once.
You might ask why I did not start by implementing it this way first. It seemed easier to do it the current way, to bring the feature "fast" to the main branch. The rest of the work can be done later, as the current implementation is not blocking further improvements of the above nature and this is already a good base.
I have already used it a few times these past weeks by taking pictures of my books, creating the recipe on my dev instance and downloading the zip file to upload it to my usual mealie instance. I think it is already more than usable, and improvements can come later with user feedback as well.

hay-kot

I tried to go through and clean up what was left to get it merged, but there was too much going on in the Vue component for me to dig through and fix everything. Maybe when I have some more time I can go through it and clean it up, but I left some comments on the issues that I'm seeing.

The biggest problem with the component is that there is just so much going on and it's difficult to group what belongs together. What I was going to do was break the functions and state that go together into separate composables and place those in a different file to allow for logical grouping of items. Maybe related: there were so many TypeScript errors in VSCode from Volar that it made reviewing extremely difficult. Not sure what the issue was there.

btw, you've got a __init.py__ file in the mealie/services/ocr that needs to be fixed.

Like I said, I tried to get it to a good point, but it was more of a weekend project. Will hopefully take another crack at it this weekend if you don't get to it first.

frontend/pages/recipe/_slug/ocr-editor.vue

Miroito · 2022-09-22T07:21:13Z

> I tried to go through and clean up what was left to get it merged, but there was too much going on in the Vue component for me to dig through and fix everything. Maybe when I have some more time I can go through it and clean it up, but I left some comments on the issues that I'm seeing.
> The biggest problem with the component is that there is just so much going on and it's difficult to group what belongs together. What I was going to do was break the functions and state that go together into separate composables and place those in a different file to allow for logical grouping of items. Maybe related: there were so many TypeScript errors in VSCode from Volar that it made reviewing extremely difficult. Not sure what the issue was there.

I can do the clean-up if you think it should be done before it is merged. One of the reasons I have not done it yet is that most of the script mess consists of event handlers for the canvas. If I move the canvas to its own component, for example, most of the current functions will follow, which just moves the mess. Most of it is math or helper functions to prevent code duplication. I'll try to make it look as good as I can, and you let me know what you think.

The least I can do is give it a try. It will be easier for me to clean up since I wrote all those things.

> btw, you've got a __init.py__ file in the mealie/services/ocr that needs to be fixed.

Lol yes nice catch, I don't really know what to write inside so I left it empty.

A more general note: Volar is complaining only in the template about the recipe returned by useRecipe, because the Recipe type has all fields optional. Honestly, this is a more general issue that even the new recipe page is not compliant with. The component prop asks for a NoUndefinedFields<Recipe> but pages/recipe/_slug/index.vue gives it a Recipe anyway. There is just one error reported instead of multiple since the RecipePage component assumes everything is there. This issue alone makes it very difficult to write a proper page using a recipe.
Maybe it would be nicer to have a set of fields that a recipe must have, and a set of truly optional properties that we can check on a case-by-case basis. I mean, even the assets (or ingredients, or instructions) property could be an empty array rather than undefined. The existence of both undefined and null is pure pain...
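
To make the suggestion concrete, here is a small sketch of what "required fields plus truly optional ones" could look like. The `LooseRecipe` and `StrictRecipe` interfaces and `normalizeRecipe` are hypothetical, heavily reduced stand-ins for illustration only; they are not Mealie's generated Recipe type or the NoUndefinedFields helper.

```typescript
// Hypothetical stand-in for a type where every field may be missing or null.
interface LooseRecipe {
  name?: string | null;
  slug?: string | null;
  description?: string | null;
  recipeIngredient?: string[] | null;
  recipeInstructions?: string[] | null;
  assets?: string[] | null;
}

// Hypothetical stricter shape: must-have fields are required, lists are always arrays,
// and only truly optional data (here: description) keeps the `?`.
interface StrictRecipe {
  name: string;
  slug: string;
  description?: string;
  recipeIngredient: string[];
  recipeInstructions: string[];
  assets: string[];
}

// Convert once at the boundary so templates never have to check for undefined or null lists.
function normalizeRecipe(r: LooseRecipe): StrictRecipe {
  if (!r.name || !r.slug) {
    throw new Error("A recipe needs at least a name and a slug");
  }
  const strict: StrictRecipe = {
    name: r.name,
    slug: r.slug,
    recipeIngredient: r.recipeIngredient ?? [],
    recipeInstructions: r.recipeInstructions ?? [],
    assets: r.assets ?? [],
  };
  if (r.description != null) {
    strict.description = r.description;
  }
  return strict;
}

const normalized = normalizeRecipe({ name: "New OCR Recipe", slug: "new-ocr-recipe", assets: null });
const assetCount = normalized.assets.length; // safe: always an array
```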

hay-kot · 2022-09-25T23:01:46Z

Merged as part of #1670 with a few minor things cleaned up. Thanks for sticking with this one until the end. 🎉

Miroito force-pushed the ocr branch from 2e9e20c to cdf450d Compare May 19, 2022 17:42

Miroito force-pushed the ocr branch from cdf450d to 0f068b5 Compare May 23, 2022 08:04

Miroito force-pushed the ocr branch 3 times, most recently from 5863804 to 986c068 Compare July 14, 2022 09:53

Miroito commented Jul 16, 2022

.pre-commit-config.yaml (outdated, resolved)

Miroito force-pushed the ocr branch from 1c2318a to 12c2d78 Compare July 18, 2022 10:07

Miroito marked this pull request as ready for review July 28, 2022 17:49

Miroito force-pushed the ocr branch from 7837af5 to 1fa937c Compare July 30, 2022 12:03

Miroito changed the title ~~Bring OCR to Mealie for importing scanned recipes~~ Draft: Bring OCR to Mealie for importing scanned recipes Aug 5, 2022

Miroito changed the title ~~Draft: Bring OCR to Mealie for importing scanned recipes~~ Draft:Bring OCR to Mealie for importing scanned recipes Aug 5, 2022

Miroito changed the title ~~Draft:Bring OCR to Mealie for importing scanned recipes~~ Bring OCR to Mealie for importing scanned recipes Aug 5, 2022

Miroito marked this pull request as draft August 5, 2022 17:46

Miroito force-pushed the ocr branch from 08481ff to 10f7680 Compare August 8, 2022 06:43

Miroito marked this pull request as ready for review August 8, 2022 17:04

hay-kot reviewed Aug 9, 2022

.pre-commit-config.yaml (outdated, resolved)

frontend/api/class-interfaces/ocr.ts (outdated, resolved)

mealie/schema/recipe/recipe.py (resolved)

Miroito force-pushed the ocr branch 2 times, most recently from 8e82e38 to a1bdaf4 Compare August 16, 2022 16:07

Miroito marked this pull request as draft August 19, 2022 17:47

Miroito force-pushed the ocr branch from 67c00af to 394f1c4 Compare September 3, 2022 16:17

Miroito marked this pull request as ready for review September 3, 2022 16:17

Miroito added 11 commits September 11, 2022 21:50

Fix type and class initialization

c260dbf

Add multi-language support

b38ae66

Highlight words in mount

17cad03

Fix image ratio bug

3731746

Better ocr creation page

fc7221c

Revert awkward feature to scroll in Selection mode

35f4049

Rebasing alembic migrations sux

965c85c

Remove obsolete getShared function

e7f15ef

Add function docstring

7941a76

Move down ocr creation option

0a8412c

Make toolbar icons more generic

0215ec5

Miroito force-pushed the ocr branch from 08e5c46 to 0215ec5 Compare September 12, 2022 11:47

Show help at the bottom of the page

ec27d35

Miroito requested a review from hay-kot September 16, 2022 17:18

hay-kot mentioned this pull request Sep 22, 2022

Pr/miroito-1244 #1662

Closed

hay-kot reviewed Sep 22, 2022

frontend/pages/recipe/_slug/ocr-editor.vue (3 comments, outdated, resolved)

Miroito added 8 commits September 22, 2022 12:53

move ocr types to own file

53f13ed

Use template ref for the canvas

7907548

Use i18n.tc to get strings directly

a5d08b2

Correct naming mistake

734ade1

Move Ocr editor to own directory

27d19cb

Create Ocr Editor parts

3ff376d

Safeguard recipe properties access

ac9da82

Add loading frontend animation due to longer request time

ad59df2

Miroito force-pushed the ocr branch from e6c720a to ad59df2 Compare September 25, 2022 09:07

hay-kot mentioned this pull request Sep 25, 2022

feat (WIP): bring png OCR scanning support #1670

Merged

hay-kot closed this Sep 25, 2022

michael-genson mentioned this pull request Jun 1, 2023

[Nightly] - OCR Editor Fails to Copy Text to Selected Field in Instructions #2394

Closed

5 tasks
